A HIGH PERFORMANCE MULTIPLIER PROCESSOR
FOR USE WITH AEROSPACE MICROCOMPUTERS 

P. E. Pierce 

Sandia National Laboratories 
Albuquerque, New Mexico 


An MC68000-based microcomputer including a hardware multiplier 
processor has been designed and prototyped for a re-entry 
vehicle navigation and control application. In this paper, 
the microcomputer is discussed with emphasis on the multiplier 
processor architecture, software control and theory of operation. 

The MC68000 CPU alone cannot satisfy the real-time multiply
processing requirements of a high-accuracy re-entry vehicle (RV)
navigator. Adding a board-level Hardware Multiplier Processor
(HMP) increases the throughput of the standalone CPU for
multiply-intensive applications approximately seven times.
Although the HMP was designed for the MC68000 microcomputer,
it can be used with any 16 or 32 bit CPU with minimal
modifications.

The memory-mapped HMP performs 16 and 32 bit multiplications
and can optionally add the full product to, or subtract it from,
the previous accumulator contents. The circuitry is fast enough
that the MC68000, running at 8 MHz, can write single or double
precision variables to the HMP using memory-to-memory transfers
and perform an operation with no wait states introduced and no
overhead time for command passing.
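
The multiply with optional add or subtract described above can be sketched in a few lines of Python. This is a minimal model, not the HMP logic itself: the 64-bit accumulator width, the signed two's-complement interpretation of operands, and the names `to_signed` and `mac` are assumptions made for illustration.

```python
ACC_BITS = 64  # assumption: "quadruple precision" taken as four 16-bit words


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, a, b, bits=32, subtract=False, clear=False):
    """Return (new_acc, overflow) for acc +/- a*b.

    a and b are raw bit patterns of width `bits` (16 or 32).
    overflow is True when the exact sum does not fit in ACC_BITS.
    """
    product = to_signed(a, bits) * to_signed(b, bits)  # full-width product
    base = 0 if clear else acc
    total = base - product if subtract else base + product
    wrapped = to_signed(total, ACC_BITS)
    return wrapped, wrapped != total


# Single precision example: clear, then accumulate 3 * (-2) + 5 * 4
acc, ovf = mac(0, 0x0003, 0xFFFE, bits=16, clear=True)
acc, ovf = mac(acc, 0x0005, 0x0004, bits=16)
print(acc, ovf)
```

The `clear` and `subtract` flags mirror the function variants listed in the HMP address map; how the hardware actually detects accumulation overflow is not stated in the text.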

The result of multiply and accumulate operations may be
transferred in its entirety, or scaled by 2^±30 and rounded
automatically prior to transfer to the destination location
specified by the CPU. Worst case CPU wait times introduced
are 3.3 µsec for double precision scale by 2^-14 and round
to single precision, and 6.3 µsec for quadruple precision scale
by 2^-30 and round to double precision.
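
As an illustration of what scaling by a power of two and rounding means arithmetically, the sketch below multiplies an accumulator value by 2^n and rounds to the nearest integer. The rounding rule (round half up), the range check, and the function name `scale_and_round` are assumptions; the text does not state which rounding rule the HMP applies or which register bits it returns.

```python
def scale_and_round(acc, n, out_bits=32):
    """Return (round(acc * 2**n), fits).

    Rounding is round-half-up (an assumed rule). `fits` is True when the
    result is representable as a signed out_bits-bit integer.
    """
    if n >= 0:
        result = acc << n
    else:
        shift = -n
        # add half of the discarded weight, then arithmetic shift right
        result = (acc + (1 << (shift - 1))) >> shift
    lo = -(1 << (out_bits - 1))
    hi = (1 << (out_bits - 1)) - 1
    return result, lo <= result <= hi


print(scale_and_round(5 << 29, -30))     # 2.5 scaled down by 2^30
print(scale_and_round(-(5 << 29), -30))  # -2.5 scaled down by 2^30
```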

The Hardware Multiplier Processor incorporates Serial/Parallel
Hardware Multiplier ICs, a translation PROM and
address-controlled logic to implement the previously mentioned
arithmetic functions. The use of serial arithmetic circuitry
yields a processor of small physical size, low power and
significant flexibility. The computation time of the HMP is
shorter than most of the general memory addressing modes of
the host CPU.

The nine least significant CPU address bits in conjunction 
with the translation PROM control all HMP functions. The 
translation PROM provides the function related serial clock 
count to the clock control logic which in turn controls all 
HMP timing. 





SANDIA AEROSPACE COMPUTER VERSION 4
(SANDAC IV)


ARCHITECTURE 
MC68000 CPU 

32 BIT DATA AND ADDRESS REGISTERS 
56 INSTRUCTIONS 
14 ADDRESSING MODES
MEMORY MAPPED I/O 
16 BIT DATA BUS 
16 M BYTE ADDRESS SPACE 
HARDWARE MULTIPLIER PROCESSOR (HMP) 
VECTORED INTERRUPTS 

POWER REQUIREMENTS 
+5V 3.3A TYPICAL

PHYSICAL 

EXPANDABLE MODULAR CONSTRUCTION 
STACKABLE PIN-SOCKET INTERMODULE BUS
17.8 CM x 15.9 CM x 1.27 CM MODULES 


SANDAC IV CPU MODULE 


MC68000 CPU 

16 K BYTE EPROM MEMORY 

16K BYTE NON-VOLATILE CMOS RAM 

POWER MONITOR & RESET CIRCUIT 


SANDAC IV I/O MODULE 


4 CHANNEL OPTO-ISOLATED USART SERIAL I/O
8 CHANNEL PRIORITY INTERRUPT CONTROLLER 
5 CHANNEL PROGRAMMABLE 16 BIT TIMER/COUNTER 
16 BIT MEMORY MAPPED (4K WORD) I/O



SANDAC IV HMP MODULE 


MEMORY MAPPED REGISTERS AND FUNCTIONS 

SINGLE PRECISION (16 BIT) AND DOUBLE PRECISION 
(32 BIT) FUNCTIONS 

MULTIPLY WITH OPTIONAL ADD OR SUBTRACT TO PREVIOUS 
ACCUMULATOR CONTENTS 

SCALE 2^±N AND ROUND

OVERFLOW DETECTION RELATING TO ACCUMULATION 
ADDRESSING ERROR DETECTION 
CONTROL FUNCTIONS DERIVED FROM LATCHED ADDRESS BITS 
HOST CPU ALLOWED TO PROCEED IN PARALLEL WITH HMP 
AUTOMATIC HOLD-OFF OF HOST CPU IF HMP BUSY 


HARDWARE MULTIPLIER PROCESSOR
BLOCK DIAGRAM

(figure not reproduced)


HARDWARE MULTIPLIER PROCESSOR
CONTROL BLOCK DIAGRAM

(figure not reproduced)


EXAMPLE HMP ADDRESS MAPPED FUNCTIONS

ADDRESS
(HEX)      FUNCTION

FE00       READ-CLEAR STATUS REGISTER
FE02       READ/WRITE P4
FE04       READ/WRITE P3
FE06       READ/WRITE P2
FE08       READ/WRITE P1
FE0A       READ/WRITE M4
FE0C       READ/WRITE M3
FE0E       WRITE M2
FE0E       READ STATUS REGISTER
FE1E       WRITE M2, S.P. MULTIPLY & ADD
FE3E       WRITE M2, CLEAR ACCUM., S.P. MULT. & ADD
FE5E       WRITE M2, S.P. MULTIPLY & SUBTRACT
FE7E       WRITE M2, CLEAR ACCUM., S.P. MULT. & SUB.
FE8E       WRITE M2
FE90       WRITE M1, D.P. MULTIPLY & ADD
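
The address map above can be read as a function code carried in the low address bits. The sketch below extracts the nine least significant address bits and shows one pattern visible in the four single precision multiply entries: bit 0x20 accompanies CLEAR ACCUM. and bit 0x40 accompanies SUBTRACT. That bit assignment is an observation from these four table rows only, not a documented encoding, and the names used here are hypothetical.

```python
SELECT_MASK = 0x1FF      # nine least significant CPU address bits

SP_MULTIPLY_ADD = 0xFE1E  # WRITE M2, S.P. MULTIPLY & ADD (from the table)
CLEAR_BIT = 0x20          # observed: set in FE3E and FE7E
SUBTRACT_BIT = 0x40       # observed: set in FE5E and FE7E


def select_bits(address):
    """Return the nine address bits that select an HMP function."""
    return address & SELECT_MASK


def sp_multiply_address(clear=False, subtract=False):
    """Build the write address for an S.P. multiply variant (inferred pattern)."""
    address = SP_MULTIPLY_ADD
    if clear:
        address |= CLEAR_BIT
    if subtract:
        address |= SUBTRACT_BIT
    return address


for clear in (False, True):
    for subtract in (False, True):
        addr = sp_multiply_address(clear, subtract)
        print(f"clear={clear!s:5} subtract={subtract!s:5} -> {addr:04X}")
```

Because the command is encoded in the address, a single write cycle delivers both the operand and the operation to perform.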










EXAMPLE HMP ADDRESS MAPPED FUNCTIONS

ADDRESS
(HEX)      FUNCTION

FF00       D.P. REG x 2^-14 & S.P. ROUND
 .           .
 .           .
FF1A       D.P. REG x 2^-1 & S.P. ROUND
FF1C       D.P. REG x 2^0 & S.P. ROUND
FF1E       D.P. REG x 2^1 & S.P. ROUND
 .           .
 .           .
FF38       D.P. REG x 2^14 & S.P. ROUND
FF80       Q.P. REG x 2^-30 & D.P. ROUND
 .           .
 .           .
FFBA       Q.P. REG x 2^-1 & D.P. ROUND
FFBC       Q.P. REG x 2^0 & D.P. ROUND
FFBE       Q.P. REG x 2^1 & D.P. ROUND
 .           .
 .           .
FFF8       Q.P. REG x 2^30 & D.P. ROUND
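
The listed endpoints are consistent with one address step of 2 per unit of exponent, centered on the 2^0 entries at FF1C and FFBC. The sketch below computes a scale-and-round address on that assumption. The intermediate entries are elided in the table, so the even spacing is an inference, and `scale_address` is a hypothetical helper name.

```python
SCALE_MAP = {
    # kind: (address of the 2^0 entry, largest exponent magnitude listed)
    "DP": (0xFF1C, 14),  # D.P. register, S.P. rounded result
    "QP": (0xFFBC, 30),  # Q.P. register, D.P. rounded result
}


def scale_address(kind, exponent):
    """Return the assumed HMP address for REG x 2^exponent with rounding."""
    zero_address, limit = SCALE_MAP[kind]
    if not -limit <= exponent <= limit:
        raise ValueError(f"exponent {exponent} outside +/-{limit} for {kind}")
    return zero_address + 2 * exponent


for kind, n in (("DP", -14), ("DP", 14), ("QP", -30), ("QP", 30)):
    print(kind, n, f"{scale_address(kind, n):04X}")
```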


FUNCTION EXECUTION TIME


FUNCTION                              EXECUTION TIME

S.P. MULTIPLY & ACCUMULATE            2.38 µs
D.P. MULTIPLY & ACCUMULATE            4.38 µs
D.P. SCALE 2^-14 & S.P. ROUND         3.31 µs
D.P. SCALE 2^0 & S.P. ROUND           2.44 µs
D.P. SCALE 2^14 & S.P. ROUND          1.56 µs
Q.P. SCALE 2^-30 & D.P. ROUND         6.31 µs
Q.P. SCALE 2^0 & D.P. ROUND           4.44 µs
Q.P. SCALE 2^30 & D.P. ROUND          2.56 µs
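
The scale-and-round times listed above vary linearly with the exponent. The sketch below checks a simple model in which each unit of exponent changes the time by 0.0625 microseconds relative to the 2^0 entry. The step size is inferred from the listed values (taking all of them to be in microseconds) and is not stated in the source; the names are hypothetical.

```python
STEP_US = 0.0625  # assumed time change per unit of exponent (inferred)

ZERO_SCALE_US = {"DP": 2.44, "QP": 4.44}  # listed times for the 2^0 entries

LISTED_US = {
    ("DP", -14): 3.31, ("DP", 0): 2.44, ("DP", 14): 1.56,
    ("QP", -30): 6.31, ("QP", 0): 4.44, ("QP", 30): 2.56,
}


def scale_time_us(kind, exponent):
    """Estimate the scale-and-round time in microseconds (linear model)."""
    return ZERO_SCALE_US[kind] - exponent * STEP_US


for (kind, n), listed in LISTED_US.items():
    print(f"{kind} 2^{n:+d}: listed {listed:.2f}, model {scale_time_us(kind, n):.3f}")
```

Under this model the worst case wait times correspond to the most negative exponents, which agrees with the figures quoted for 2^-14 and 2^-30.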



SANDAC IV BENCHMARK EQUATION

A_11 = B_11 C_11 + B_12 C_21 + B_13 C_31 + K
NOTE: ALL TERMS ARE 32 BIT FIXED POINT.

CONFIGURATION                                 EXECUTION TIME

MC68000 CPU @ 8 MHZ (SUBROUTINE SOLUTION)     235 µs
MC68000 CPU @ 8 MHZ + HMP (HMP SOLUTION)      31 µs


BENCHMARK EQUATION MACRO INSTRUCTION SOLUTION


A_11 = B_11 C_11 + B_12 C_21 + B_13 C_31 + K


SOURCE CODE:

LQPP   #0, K              /LOAD Q.P. CONSTANT
DPMA   B_11, C_11         /D.P. MULTIPLY & ADD
DPMA   B_12, C_21         /D.P. MULTIPLY & ADD
DPMA   B_13, C_31         /D.P. MULTIPLY & ADD
DPSRM  0, A_11            /QUAD P. SCALE, ROUND & MOVE



BENCHMARK EQUATION MACRO EXPANSION
A_11 = B_11 C_11 + B_12 C_21 + B_13 C_31 + K


ASSEMBLER EXPANSION:

MACRO                MC68000 MNEMONICS       COMMENT

LQPP  #0, K          MOVE.L 0, FE02          /LOAD Q.P. CONSTANT
                     MOVE.L K, FE06

DPMA  B_11, C_11     MOVE.L B_11, FE0A       /D.P. MULTIPLY & ADD
                     MOVE.L C_11, FE8E

DPMA  B_12, C_21     MOVE.L B_12, FE0A       /D.P. MULTIPLY & ADD
                     MOVE.L C_21, FE8E

DPMA  B_13, C_31     MOVE.L B_13, FE0A       /D.P. MULTIPLY & ADD
                     MOVE.L C_31, FE8E

DPSRM 0, A_11        MOVE.L FFBC, A_11       /QUAD P. SCALE, ROUND & MOVE
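
To show why one DPMA macro needs no separate command transfer, the sketch below splits each 32-bit write of the expansion into the two 16-bit bus cycles an MC68000 performs on its 16-bit data bus, high word first at the lower address, and labels each cycle with the write function from the example address map. It models the write sequence only; the read side of DPSRM and the HMP arithmetic are not modeled, and the function names are hypothetical.

```python
# Write functions transcribed from the example address map
WRITE_FUNCTIONS = {
    0xFE02: "WRITE P4", 0xFE04: "WRITE P3",
    0xFE06: "WRITE P2", 0xFE08: "WRITE P1",
    0xFE0A: "WRITE M4", 0xFE0C: "WRITE M3",
    0xFE8E: "WRITE M2", 0xFE90: "WRITE M1, D.P. MULTIPLY & ADD",
}


def long_write_cycles(value, address):
    """Split a 32-bit write into two 16-bit bus cycles, high word first."""
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return [
        (address, high, WRITE_FUNCTIONS[address]),
        (address + 2, low, WRITE_FUNCTIONS[address + 2]),
    ]


def lqpp(upper, k):
    """LQPP expansion: load the accumulator words P4..P1."""
    return long_write_cycles(upper, 0xFE02) + long_write_cycles(k, 0xFE06)


def dpma(b, c):
    """DPMA expansion: write both operands; the last cycle starts the multiply."""
    return long_write_cycles(b, 0xFE0A) + long_write_cycles(c, 0xFE8E)


# Arbitrary example operand values
for addr, word, function in dpma(0x00012345, 0x0000ABCD):
    print(f"{addr:04X} <- {word:04X}  {function}")
```

The final word of the second operand lands on FE90, so the data transfer itself issues the multiply and add command.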


SUMMARY 

EFFECTIVELY EXPANDS HOST CPU INSTRUCTION SET 

EASY INCORPORATION INTO ANY 16 BIT SYSTEM 

HIGH PERFORMANCE DUE TO SIMULTANEOUS DATA & COMMAND 
TRANSFER BY HOST CPU 

SERIAL ARITHMETIC APPROACH REDUCES COMPONENT COUNT 

EQUATION EXECUTION TIME PRIMARILY DEPENDENT ON CPU 
MEMORY ACCESS TIME 

STRAIGHTFORWARD SOFTWARE CONTROL

SINGLE µP CPU PLUS HMP PROVIDES PERFORMANCE COMPARABLE
TO BIPOLAR BIT-SLICE DESIGNS 